1  Overview


Three central goals of work in the generalized phrase structure grammar
(GPSG) linguistic framework, as stated in the leading book “Generalized
Phrase Structure Grammar” by Gazdar et al. (1985) (hereafter GKPS), are:
(1) to characterize all and only the natural language grammars, (2) to
algorithmically determine membership and generative power consequences of
GPSGs, and (3) to embody the universalism of natural language entirely in
the formal system, rather than by statements made in it.1

These pages formally consider whether GPSG’s weak context-free generative
power (wcfgp) will allow it to achieve the three goals. The centerpiece
of this paper is a proof that it is undecidable whether an arbitrary GPSG
generates the nonnatural language Σ*. On the basis of this result, I argue
that GPSG fails to define the natural language grammars, and that
the generative power consequences of the GPSG framework cannot be
algorithmically determined, contrary to goals one and two.2 In the process,
I examine the linguistic universalism of the GPSG formal system and argue
that GPSGs can describe an infinite class of nonnatural context-free
languages. The paper concludes with a brief diagnosis of the result and
suggests that the problem might be met by abandoning the weak context-free
generative power framework and assuming substantive constraints.

1 GKPS clearly outline their goals. One, “to arrive at a constrained metalanguage
capable of defining the grammars of natural languages, but not the grammar of, say, the
set of prime numbers.”(p.4) Two, to construct an explicit linguistic theory whose formal
consequences are clearly and easily determinable. These ‘formal consequences’ include
both the generative power consequences demanded by the first goal and membership
determination: GPSG regards languages “as collections whose membership is definitely
and precisely specifiable.”(p.1) Three, to define a linguistic theory where “the universalism
[of natural language] is, ultimately, intended to be entirely embodied in the formal system,
not expressed by statements made in it.”(p.4, my emphasis)

2 The proof technique makes use of invalid computations, and the actual GPSG
constructed is so simple, so similar to the GPSGs proposed for actual natural languages,
and so flexible in its exact formulation that the method of proof suggests there may be no
simple reformulations of GPSG that avoid this problem. The proof also suggests that it is
impossible in principle to algorithmically determine whether linguistic theories based on
a wcfgp framework (e.g. GPSG) actually define the natural language grammars.
1.1  The  Structure  of  GPSG  Theory

A  generalized  phrase  structure  grammar  contains  five  language-particular 
components  (immediate  dominance  (ID)  rules,  metarules,  linear  precedence 
(LP)  statements,  feature  co-occurrence  restrictions  (FCRs),  and  feature 
specification  defaults  (FSDs))  and  four  universal  components:  a  theory  of 
syntactic  features,  principles  of  universal  feature  instantiation,  principles  of 
semantic  interpretation,  and  formal  relationships  among  various  components 
of  the  grammar.3 

The  set  of  ID  rules  obtained  by  taking  the  finite  closure  of  the  metarules 
on the ID rules is mapped into local phrase structure trees, subject to
principles of universal feature instantiation, FSDs, FCRs, and LP statements.
Finally,  these  local  trees  are  assembled  to  form  phrase  structure  trees,  which 
are  terminated  by  lexical  elements. 
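
The ordering step of this mapping can be illustrated in isolation. The following is a minimal Python sketch, not GKPS's formal definition: it treats categories as atomic strings, ignores metarules, feature instantiation, FCRs and FSDs, and only shows how one ID rule with an unordered right-hand side is mapped to the local trees admitted by a set of LP statements. The function names `respects_lp` and `local_trees`, and the VP rule in the usage note, are hypothetical.

```python
from itertools import permutations


def respects_lp(order, lp):
    """Return True if the sister ordering violates no LP statement.

    `lp` is a set of pairs (a, b), each read as the LP statement a < b.
    """
    positions = {}
    for index, category in enumerate(order):
        positions.setdefault(category, []).append(index)
    for a, b in lp:
        if a in positions and b in positions:
            # Every occurrence of a must precede every occurrence of b.
            if max(positions[a]) > min(positions[b]):
                return False
    return True


def local_trees(id_rule, lp):
    """Map one ID rule (mother, unordered daughters) to its admissible local trees."""
    mother, daughters = id_rule
    seen = set()
    trees = []
    for order in permutations(daughters):
        # Repeated daughters produce duplicate permutations; keep each order once.
        if order not in seen and respects_lp(order, lp):
            seen.add(order)
            trees.append((mother, order))
    return trees
```

With the hypothetical rule VP → V, NP, PP and the LP statements V < NP and V < PP, the call `local_trees(("VP", ("V", "NP", "PP")), {("V", "NP"), ("V", "PP")})` admits exactly the two orders V NP PP and V PP NP.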

The  essence  of  GPSG  is  the  constrained  mapping  of  ID  rules  into  local 
trees.  The  constraints  of  GPSG  theory  subdivide  into  absolute  constraints 
on  local  trees  (due  to  FCRs  and  LP-statements)  and  relative  constraints  on 
the  rule  to  local  tree  mapping  (stemming  from  FSDs  and  universal  feature 
instantiation).  The  absolute  constraints  are  all  language-particular,  and 
consequently  not  inherent  in  the  formal  GPSG  framework.  Similarly,  the 
relative  constraints,  of  which  only  universal  instantiation  is  not  explicitly 
language-particular,  do  not  apply  to  fully  specified  ID  rules  and  consequently 
are  not  strongly  inherent  in  the  GPSG  framework  either.4  In  summary, 
GPSG  local  trees  are  only  as  constrained  as  ID  rules  are:  that  is,  not  at  all. 

The  only  constraint  strongly  inherent  in  GPSG  theory  (when  compared 
to  context-free  grammars  (CFGs))  is  finite  feature  closure,  which  limits  the 
number of GPSG nonterminal symbols to be finite and bounded.5

3 This work is based on current GPSG theory as presented in GKPS. The reader is
urged  to  consult  that  work  for  a  formal  presentation  and  thorough  exposition  of  current 
GPSG  theory. 

4 I use “strongly inherent” to mean “unavoidable by virtue of the formal framework.”
Note that the use of problematic feature specifications in universal feature instantiation
means that this constraint is dependent on other, parochial, components (e.g. FCRs).
Appropriate choice of FCRs or ID rules will abrogate universal feature instantiation, thus
rendering it implicitly language-particular too.

5 This formal constraint is extremely weak, however, since the theory of syntactic
features licenses more than 10^774 syntactic categories. See Ristad(1986) for a discussion.
1.2  A  Nonnatural  GPSG

Consider the exceedingly simple GPSG for the nonnatural language Σ*,
consisting solely of the two ID rules

This GPSG generates local trees with all possible subcategorization
specifications: the SUBCAT feature may assume any value in the non-head
daughter of the first ID rule, and S generates the nonnatural language Σ*.

This  exhibit  is  inconclusive,  however.  We  have  only  shown  that  GKPS 
—  and  not  GPSG  —  have  failed  to  achieve  the  first  goal  of  GPSG  theory. 
The  exhibition  leaves  open  the  possibility  of  trivially  reformalizing  GPSG 
or imposing ad-hoc constraints on the theory such that I will no longer be
able to personally construct a GPSG for Σ*.


2  Undecidability  and  Generative  Power  in  GPSG 

That “= Σ*?” is undecidable for arbitrary context-free grammars is a well-known
result in the formal language literature (see Hopcroft and Ullman(1979:201-203)).
The standard proof is to construct a pushdown automaton (PDA) that
accepts all invalid computations of a Turing machine (TM) M. From this
PDA an equivalent CFG G is directly constructible. Thus, L(G) = Σ* if
and only if all computations of M are invalid, i.e. L(M) = ∅. The latter
problem is undecidable, so the former must be also.
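
Spelled out, the reduction is a short chain of equivalences. The block below only restates the argument of the preceding paragraph; it assumes, as the text does implicitly, that Σ denotes the alphabet in which computations of M are written.

```latex
\begin{align*}
w \in L(G) &\iff w \text{ is an invalid computation of } M, \\
L(G) = \Sigma^{*} &\iff M \text{ has no valid computation} \\
                  &\iff L(M) = \emptyset .
\end{align*}
```

An algorithm deciding the first line's question for every CFG G would therefore decide emptiness for every Turing machine M, which is impossible.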

No such reduction is possible for a proof that “= Σ*?” is undecidable
for  arbitrary  GPSGs.  In  the  above  reduction,  the  number  of  nonterminals 
in  G  is  a  function  of  the  size  of  the  simulated  TM  M.  GPSGs,  however, 
have  a  bounded  number  of  nonterminal  symbols,  and  as  discussed  above, 
that  is  the  essential  difference  between  CFGs  and  GPSGs. 

Only  weak  generative  power  is  of  interest  for  the  following  proof,  and  the 
formal  GPSG  constraints  on  weak  generative  power  are  trivially  abrogated. 

For example, exhaustive constant partial ordering (ECPO), which is a
constraint on strong generative capacity, can be done away with for all
intents and purposes by nonterminal renaming, and constraints arising from
principles of universal feature instantiation don’t apply to fully instantiated
ID rules.

First, a proof that “= Σ*?” is undecidable for context-free grammars
with  a  very  small  number  of  terminal  and  nonterminal  symbols  is  sketched. 
Following  the  proof  for  CFGs,  the  equivalent  proof  for  GPSGs  is  outlined. 

2.1  Outline  of  a  Proof  for  Small  CFGs 

Let L(x,y) be the class of context-free grammars with at least x nonterminal
and y terminal symbols. I now sketch a proof that it is undecidable of
an arbitrary CFG G ∈ L(x,y) whether L(G) = Σ* for some x,y greater
than fixed lower bounds. The actual construction details are of no obvious
mathematical or pedagogical interest, and will not be included. The idea
is to directly construct a CFG to generate the invalid computations of the
Universal Turing Machine (UTM). This grammar will be small if the UTM is
small. The “smallest UTM” of Minsky(1967:276-281) has seven states and
a four symbol tape alphabet, for a state-symbol product of 28 (!). Hence,
it is not surprising that the “smallest G_UTM” that generates the invalid
computations of the UTM has seventeen nonterminals and two terminals.

Observe that if a string w is an invalid computation of the universal Turing
machine M = (Q, Σ, Γ, δ, q0, B, F) on input x, then one of the following
conditions must hold.

1.  w has a “syntactic error,” that is, w is not of the form x_1#x_2#···#x_m#,
where each x_i is an instantaneous description (ID) of M. Therefore,
some x_i is not an ID of M.

2.  x_1 is not initial; that is, x_1 ∉ q0Σ*

3.  x_m is not final; that is, x_m ∉ Γ*FΓ*

4.  x_i ⊢_M (x_{i+1})^R is false for some odd i

5.  (x_i)^R ⊢_M x_{i+1} is false for some even i
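
The five conditions can be checked mechanically for any concrete machine. The following Python sketch is an illustration under simplifying assumptions, not the grammar construction itself: states and tape symbols are single characters, an ID is written αqβ with the head on the first symbol of β, trailing blanks are never trimmed, and the two-state machine `TOY` is a hypothetical example. All names (`TM`, `is_id`, `step`, `is_invalid_computation`) are introduced here for illustration only.

```python
from dataclasses import dataclass


@dataclass
class TM:
    states: frozenset   # Q
    sigma: frozenset    # input alphabet
    gamma: frozenset    # tape alphabet, includes the blank
    delta: dict         # (state, symbol) -> (state, symbol, "L" or "R")
    q0: str
    blank: str
    finals: frozenset   # F


def is_id(tm, s):
    """An ID contains exactly one state symbol; all other symbols are tape symbols."""
    heads = [c for c in s if c in tm.states]
    return len(heads) == 1 and all(c in tm.gamma or c in tm.states for c in s)


def step(tm, s):
    """Return the successor ID of s, or None if the machine has no move."""
    i = next(k for k, c in enumerate(s) if c in tm.states)
    left, q, right = s[:i], s[i], s[i + 1:]
    scanned = right[0] if right else tm.blank
    move = tm.delta.get((q, scanned))
    if move is None:
        return None
    p, y, direction = move
    if direction == "R":
        return left + y + p + right[1:]
    if not left:
        return None  # cannot move left off the end of the tape
    return left[:-1] + p + left[-1] + y + right[1:]


def is_invalid_computation(tm, w):
    if not w.endswith("#"):
        return True                                   # condition 1
    blocks = w[:-1].split("#")
    # Even-numbered blocks (counting from 1) are written reversed.
    ids = [b if k % 2 == 0 else b[::-1] for k, b in enumerate(blocks)]
    if not all(is_id(tm, x) for x in ids):
        return True                                   # condition 1
    first, last = ids[0], ids[-1]
    if first[0] != tm.q0 or any(c not in tm.sigma for c in first[1:]):
        return True                                   # condition 2
    if not any(c in tm.finals for c in last):
        return True                                   # condition 3
    # Conditions 4 and 5: some consecutive pair is not related by one move.
    return any(step(tm, a) != b for a, b in zip(ids, ids[1:]))


# Hypothetical toy machine: accept as soon as the first input symbol is 1.
TOY = TM(states=frozenset("qf"), sigma=frozenset("01"), gamma=frozenset("01B"),
         delta={("q", "1"): ("f", "1", "R")}, q0="q", blank="B",
         finals=frozenset("f"))
```

For `TOY` on input 10, the string `q10#0f1#` is a valid computation (the second block is the reversal of the ID `1f0`), so `is_invalid_computation(TOY, "q10#0f1#")` is false, while the same string with the second block left unreversed, `q10#1f0#`, is an invalid computation.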

Straightforward construction of G_UTM will result in a CFG containing on
the order of twenty or thirty nonterminals and at least fifteen terminals (one
for each UTM state and tape symbol, one for the blank-tape symbol, and one
for the instantaneous description separator “#”). Then the subgrammars
which ensure that (x_i)^R ⊢_M x_{i+1} is false for some even i and that x_i ⊢_M
in the construction will be fully specified so as to defeat universal feature
instantiation,  and  the  construction  will  use  nonterminal  renaming  to  avoid 
ECPO. 

Let the GPSG category C be fully specified for all features (the actual
values don’t matter) with the exception of, say, the binary features GER,
NEG, NULL and POSS. Arrange those four features in some canonical order,
and let binary strings of length four represent the values assigned to those
features in a given category. For example, C[0100] represents the category C
with the additional specifications ([-GER], [+NEG], [-NULL], [-POSS]).
We replace Sodd by C[0000], S1 by C[0001], S2 by C[0010], S3 by C[0011],
Sg by C[0100], and S7 by C[0101]. The nonterminal T is replaced by three
symbols of the form C[11xx], one for each linear precedence to which T
conforms. Similarly, E is replaced by two symbols of the form C[100x]. The
ID rules, in the same order as the CF productions above (with a portion of
the necessary LP statements) are:

C[0000] → C[0001] #

C[0001] → C[1100] C[0001] C[1101] | C[0010] | C[0100] | C[0101]

C[0100] → C[1100] C[0100] | C[1100] C[0011]

C[0101] → C[0101] C[1101] | C[0011] C[1101]

C[0010] → C[1000] a C[1001] C[0011] C[1101] b C[1110]
          where a ≠ b, both in Σ

C[0010] → a q b C[0011] {Γ^3 − pca}   if δ(q,b) = (p,c,R)
          a q b C[0011] {Γ^3 − cap}   if δ(q,b) = (p,c,L)

C[0010] → a q B # B {Γ^3 − pca}   if δ(q,B) = (p,c,R)
          a q B # B {Γ^3 − cap}   if δ(q,B) = (p,c,L)

C[0011] → C[1100] C[0011] C[1101] |
          Q5#S C[1100] C[1101] |
          C[1000] B # B C[1100]

C[1100] < C[0001], C[0011], C[0100], C[0101] < C[1101]

C[1000] < a < C[1001] < C[0011] < C[1110]

While the sketched ID rules are not valid GPSG rules, just as the
sketched context-free productions were not the valid components of a
context-free grammar, a valid GPSG can be constructed in a straightforward and
obvious manner from the sketched ID rules. There would be no metarules,
FCRs or FSDs in the actual grammar.
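
The binary-string naming of categories used in the sketched rules can be made concrete. The following Python sketch is only an illustration of the encoding described above, taking the canonical feature order to be GER, NEG, NULL, POSS as in the C[0100] example; the function names `decode`, `encode` and `expand` are hypothetical.

```python
from itertools import product

# Canonical order of the four features left unspecified in category C.
FEATURES = ("GER", "NEG", "NULL", "POSS")


def decode(bits):
    """Turn a four-bit string such as '0100' into signed feature specifications."""
    if len(bits) != len(FEATURES) or set(bits) - {"0", "1"}:
        raise ValueError("expected a string of four binary digits")
    return tuple(("+" if b == "1" else "-") + f for f, b in zip(FEATURES, bits))


def encode(specs):
    """Inverse of decode: signed specifications back to a four-bit string."""
    values = {s[1:]: s[0] for s in specs}
    return "".join("1" if values[f] == "+" else "0" for f in FEATURES)


def expand(pattern):
    """List the bit strings matched by a pattern in which 'x' stands for either value."""
    slots = [("0", "1") if c == "x" else (c,) for c in pattern]
    return ["".join(choice) for choice in product(*slots)]
```

Here `decode("0100")` yields the specifications -GER, +NEG, -NULL, -POSS given in the text, `expand("100x")` lists the two categories available for E, and `expand("11xx")` lists four candidates, of which the sketched rules use three for T.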

The last comment to be made is that in the actual G_UTM, only the
number  of  productions  is  a  function  of  the  size  of  the  UTM.  The  UTM  is 
used  only  as  a  convincing  crutch,  because  only  a  small,  fixed  number  of 
nonterminals  are  needed  to  construct  a  CFG  for  the  invalid  computations  of 
any  arbitrary  Turing  Machine. 


3  Interpreting  the  Result 

The preceding pages have shown that the extremely simple nonnatural language
Σ* is generated by a GPSG, as is the more complex language L_IC
consisting of the invalid computations of an arbitrary Turing machine on an
arbitrary input. Because L_IC is a GPSG language, “= Σ*?” is undecidable
for GPSGs: there is no algorithmic way of knowing whether any given GPSG
generates a natural language or an unnatural one. So, for example, no
algorithm can tell us whether the English GPSG of GKPS really generates
English or Σ*.

The result suggests that goals 1, 2, 3 and the context-free framework
conflict with each other. Weak context-free generative power allows both
Σ* and L_IC, yet by goal 1 we must exclude nonnatural languages. Goal 2
demands it be possible to algorithmically determine whether a given GPSG
generates a desired language or not, yet this cannot be done in the
context-free framework. Lastly, goal 3 requires that all nonnatural languages be
excluded on the basis of the formal system alone, but this looks to be
impossible given the other two goals, the adopted framework, and the technical
vagueness of “natural language grammar.”

The problem can be met in part by abandoning the context-free framework.
Other authors have argued that natural language is not context-free,
and here we argue that the GPSG theory of GKPS can characterize context-free
languages that are too simple or trivial to be natural, e.g. any finite
or regular language.6 The context-free framework is both too weak and too
strong: it includes nonnatural languages and excludes natural ones. Moreover,
CFLs have the wrong formal properties entirely: natural language is
surely not closed under union, concatenation, Kleene closure, substitution,
or intersection with regular sets!7 In short, the context-free framework is the
wrong idea completely, and this is to be expected: why should the arbitrary
generative power classifications of mathematics (formal language theory) be
at all relevant to biology (human language)?

Goal 2, that the naturalness of grammars postulated by linguistic theory
be decidable, and to a lesser extent goal 3, are of dubious merit. In
my view, substantive constraints arising from psychology, biology or even
physics may be freely invoked, with a corresponding change in the meaning
of “natural language grammar” from “mentally-representable grammar” to
something like “easily learnable and speakable mentally-representable grammar.”
There is no a priori reason or empirical evidence to suggest that
the class of mentally representable grammars is not fantastically complex,
maybe not even decidable.8

One promising restriction in this regard, which if properly formulated
would alleviate GPSG’s actual and formal inability to characterize only the
natural language grammars, is strong nativism, the restrictive theory that
the class of natural languages is finite. This restriction is well motivated
both by the issues raised here and by other empirical considerations.9 The
restriction, which may be substantive or purely formal, is a formal attack on
the heart of the result: the theory of undecidability is concerned with the
existence or nonexistence of algorithms for solving problems with an infinity
of instances. Furthermore, the restriction may be empirically plausible.10,11

6 While ‘natural language grammar’ is not defined precisely, recent work has
demonstrated empirically that natural language is not context-free, and therefore GPSG theory
will not be able to characterize all the human language grammars. See, for example,
Higginbotham(1984), Shieber(1985), and Culy(1985). For counterarguments, see
Pullum(1985). Nash(1980), chapter 5, discusses the impossibility of accounting for free word
order languages (e.g. Warlpiri) using ID/LP grammars. I focus on the goal of
characterizing only the natural language grammars in this paper.

7 The finite, bounded number of nonterminals allowed in GPSG theory plays a linguistic
role in this regard, because the direct consequence of finite feature closure is that GPSG
languages are not truly closed under union, concatenation, or substitution.

8 See Chomsky(1980:120) for a discussion.

9 Note that invoking finiteness here is technically different from hiding intractability
with finiteness. Finiteness is the correct generalization here, because we are interested in
whether GPSG generates nonnatural languages or not, and not in the computational cost
of determining the generative capacity of an arbitrary GPSG. A finiteness restriction for
the purposes of computational complexity is invalid because it prevents us from properly
using the tools of complexity theory to study the computational complexity of a problem.

The  author  does  not  have  a  clear  idea  how  GPSG  might  be  restricted 
in  this  manner,  and  merely  suggests  strong  nativism  as  a  well-motivated 
direction  for  future  GPSG  research. 